An analysis of morphological and genetic diversity of mango fruit flies in Pakistan

Fruit flies of the genus Bactrocera are important insect pests of commercially cultivated mangos in Pakistan and limit successful production of the crop in the country. Despite the economic risk, the genetic diversity and population dynamics of this pest have remained unexplored. This study aimed to morphologically identify the Bactrocera species infesting mango in the major production areas of the country and to confirm the results with insect DNA barcoding. Infested mango fruits from the 2022 crop were collected from 46 locations in 11 major production districts of Punjab and Sindh provinces, and first-generation flies were obtained in the laboratory. All 10,653 first-generation flies were morphologically identified as one of two Bactrocera species, B. dorsalis and B. zonata, whose relative abundance differed between the two provinces. Morphological identification was confirmed by DNA barcoding based on the mitochondrial cytochrome oxidase subunit I (mt-COI) gene. Genetic analysis of the mt-COI region of 61 selected specimens validated the morphological identification through the presence of two definite clusters and reliable intraspecific distances. By combining morphological identification of a large number of field-collected fruit fly specimens from across Pakistan with DNA barcode validation, this study reports two species of Bactrocera infesting mango in the country.


Introduction
Fruit flies of the genus Bactrocera (Diptera: Tephritidae) with more than 5000 species are among the most important pests of fruits and vegetables in the world [1]. In addition to the polyphagous nature of some species, several are considered highly invasive, aided by the globalization of trade and poor quarantine infrastructure in developing countries. Adults often exhibit a strong tendency for dispersal, and the immature stages are readily transported to new areas via fruit movement [2]. Direct damage attributed to these flies ranges from 30 to 80% depending on the fruit variety, season, and location [3], resulting in annual losses worth billions of dollars. This cost includes both the losses from infestation and the expense of management [4].
Taxonomy based on morphological identification has been the gold standard for insect identification for decades [5]. However, this conventional approach is constrained by the availability of taxonomic experts and of keys specific to insect species, by sample handling, by the different stages of insect metamorphosis, by changes in morphological traits due to host adaptation and, most importantly, by the presence of hybrid species and the existence of some insects as species complexes [6]. DNA barcoding for insect identification, including the use of mt-COI gene regions, was introduced relatively recently as an approach parallel to morphology-based taxonomy. The barcode is a short, standardized sequence of the mt-COI gene that is easily amplified by a universal set of primers, and the resulting sequence provides high sequence variation at the inter- and intraspecific levels [7,8].
In Pakistan, eighteen species of fruit flies have been morphologically characterized from different fruits and vegetables [9][10][11][12][13][14][15], but there are very few reports on their genetic diversity [16,17]. Similarly, not much is known about the diversity and geographical distribution of fruit flies infesting the mango crop in the country [16,17].
In recent years, Pakistan has become a prominent producer of different varieties of mango and is currently the second largest mango-producing country [11,12,18]. Fruit flies are the most damaging pests of mango in Pakistan, and studies based on limited samples from a few areas show the presence of B. dorsalis and B. zonata on mangos in the country [17,19,20].
This project was designed to better understand the morphological and genetic diversity of fruit flies infesting mangos in all major mango growing districts across Pakistan in the summer of 2022. Infested fruits were collected from the fields, fruit flies were reared in the laboratory to obtain the first generation, and their morphological characterization was validated by genetic analysis of mt-COI DNA sequences.

Collection of fruit fly-infested mango samples
A team from CAB International surveyed 46 locations in 11 major mango cultivating districts of Pakistan during June and July of 2022. In Punjab province, 24 locations in 5 districts (Khanewal, Multan, Muzaffargarh, Rahimyar khan and Bahawalpur) were surveyed (Table 1), and in Sindh province, 22 locations in 6 districts (Mirpur Khas, Sangarh, Matiari, Umarkot, Tando Allahyar and Khairpur) were surveyed (Table 2).

Lab procedures to obtain first generation of fruit flies
The infested mango samples were kept in the rearing room of the biological control laboratory of CAB International at 25 ± 2 ˚C, 50 ± 10% RH, and a 12:12 (L:D) photoperiod. Fruit fly metamorphosis took 12-18 days, and the flies were observed daily for hatching and pupation. Upon hatching, mature larvae freely left the mango fruit to pupate in a 10-15 cm deep layer of moist (5-8% water) sand [21]. After pupation, the puparia were separated from the sand medium by sieving. The emergence of fruit flies started 4-10 days after pupation.

Morphological description of fruit flies
The taxonomic keys of Drew and Hancock and of Zubair and colleagues [15,22] were used for the morphological identification of all 10,653 first-generation flies under a stereomicroscope (Nikon SMZ1000). B. dorsalis has a clear T-shaped pattern on the abdomen, a partly black thorax with two yellow streaks, a body length of 6.5 to 7.0 mm, and wings with dark margins and a dark anal streak. In B. zonata the T-shaped pattern on the abdomen is absent, the thorax is light brown with two yellow streaks, the body length is 5.5 to 6.0 mm [15], and the wings have no dark margins or anal streaks.
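The diagnostic characters listed above can be read as a simple two-way key. The following Python sketch is only an illustration of that logic, not the published keys [15,22]: the class name `Specimen`, its field names, and the rule that all three characters must agree are assumptions made for this example, and body length is recorded but not used to decide the species.

```python
from dataclasses import dataclass


@dataclass
class Specimen:
    """Characters scored under the stereomicroscope (hypothetical field names)."""
    abdomen_t_pattern: bool   # clear T-shaped pattern on the abdomen
    thorax_colour: str        # "partly black" or "light brown"
    wing_dark_margin: bool    # dark wing margin and dark anal streak
    body_length_mm: float     # recorded for reference only


def key_out(s: Specimen) -> str:
    """Return a species name only when all three characters agree."""
    if s.abdomen_t_pattern and s.thorax_colour == "partly black" and s.wing_dark_margin:
        return "B. dorsalis"
    if (not s.abdomen_t_pattern and s.thorax_colour == "light brown"
            and not s.wing_dark_margin):
        return "B. zonata"
    # Conflicting characters: leave for expert examination or barcoding.
    return "undetermined"


fly_1 = Specimen(True, "partly black", True, 6.8)
fly_2 = Specimen(False, "light brown", False, 5.7)
fly_3 = Specimen(True, "light brown", False, 6.0)
for fly in (fly_1, fly_2, fly_3):
    print(key_out(fly))
```

Returning "undetermined" for conflicting characters is a deliberate choice in this sketch: such specimens are the ones for which DNA barcoding would be most informative.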

DNA extraction from fruit flies
Total DNA was extracted from single fruit flies by the cetyltrimethylammonium bromide (CTAB) method, as previously reported [16], with slight modifications. The head of the fly was separated from the rest of the body and ground with a micro-pestle in the bottom of a 1.5 mL centrifuge tube containing 150 μL of lysis buffer. The lysis buffer contained 100 mM Tris-HCl, 1.4 M NaCl, 20 mM EDTA, 2% CTAB (hexadecyltrimethylammonium bromide; Sigma-Aldrich, Darmstadt, Germany), and 7 μL of proteinase K (Fermentas). The lysate was incubated at 55˚C for 1 hour. Then 150 μL of chloroform:isoamyl alcohol (24:1) was added, the contents were mixed by inversion, and the emulsion was separated by centrifugation at 14,000 rpm for 10 min. The DNA was precipitated by the addition of 150 μL of 100% ethanol and 30 μL of sodium acetate, followed by freezing at -20˚C for 2 hours. The pellet was collected by centrifugation at 14,000 rpm for 10 min, washed with 150 μL of 70% ethanol, centrifuged again at 14,000 rpm for 10 min, air dried, dissolved in 20 μL of ddH2O, and stored at −20˚C.

Sequence analysis
First, all the sequences were trimmed and compared using BLAST at NCBI (https://blast.ncbi.nlm.nih.gov/Blast.cgi) to identify sequence similarities. All the sequences were truncated to the same length to eliminate missing data. The final 61 best-quality mt-COI sequences, each 620 bp long, were submitted to the NCBI GenBank database (accession nos. OP804475-OP804502 and OP804145-OP804177). After accession numbers were assigned to our sequences, genetic analysis was performed using MEGA ver. 6.0 [24] and Sequence Demarcation Tool version 1.2 [25]. Closely related sequences were retrieved from databases in FASTA format and aligned using Muscle as implemented in MEGA ver. 6.0 [24]. The phylogenetic tree was constructed by the maximum likelihood method with default settings and 1000 bootstrap replicates [26]. Liriomyza huidobrensis (AF327292) was used as an outgroup. To further evaluate the degree of genetic relatedness, a color-coded matrix of percentage pair-wise identity was generated using the Muscle algorithm in Sequence Demarcation Tool version 1.2 [25].
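To make the idea of a pair-wise identity matrix concrete, the sketch below truncates a set of sequences to a common length and computes the percentage of identical positions for every pair. It is a simplified illustration in Python, not the algorithm used by the Sequence Demarcation Tool (which aligns each pair before scoring); the function names and the toy sequences are assumptions made for this example, and the sequences are assumed to be already aligned.

```python
from itertools import combinations


def truncate_to_common_length(seqs):
    """Cut every (already aligned) sequence to the length of the shortest one."""
    n = min(len(s) for s in seqs.values())
    return {name: s[:n].upper() for name, s in seqs.items()}


def percent_identity(a, b):
    """Percentage of identical positions, ignoring gaps and ambiguous bases."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    compared = [(x, y) for x, y in zip(a, b) if x in "ACGT" and y in "ACGT"]
    if not compared:
        raise ValueError("no comparable positions")
    matches = sum(x == y for x, y in compared)
    return 100.0 * matches / len(compared)


def identity_matrix(seqs):
    """Symmetric matrix of pair-wise identities, stored as a dict of name pairs."""
    names = list(seqs)
    matrix = {(n, n): 100.0 for n in names}
    for p, q in combinations(names, 2):
        value = percent_identity(seqs[p], seqs[q])
        matrix[(p, q)] = matrix[(q, p)] = value
    return matrix


# Toy sequences, not data from this study.
toy = {
    "fly_A": "ACGTACGTAC",
    "fly_B": "ACGTACGTAT",
    "fly_C": "ACGAACGTTTGG",
}
aligned = truncate_to_common_length(toy)
matrix = identity_matrix(aligned)
for (p, q), value in sorted(matrix.items()):
    print(f"{p} vs {q}: {value:.1f}%")
```

Truncating from the 3' end is the simplest way to obtain equal lengths; with real reads the trimming would normally follow the alignment so that homologous positions are kept.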

Validation of morphological identification by analysis of mt-COI sequence diversity
The mt-COI region was amplified and sequenced from single fruit flies randomly selected from each sample. BLAST analysis of 61 representative sequences showed homology with mt-COI sequences of the two respective Bactrocera species, B. dorsalis (India MN016995 and China MG689732 isolates) and B. zonata (Iran MG881714, MG881760 and China MG962410 isolates), confirming the morphological identification. Sequence data of two reference samples of B. dorsalis and B. zonata taken from the CAB International library were also included in the analysis as controls.
Inter- and intraspecific genetic diversity among B. dorsalis and B. zonata specimens was also analysed using MEGA6. The intraspecific diversity was 0.011% within B. dorsalis and 0.016% within B. zonata, indicating little variation within each species, while the interspecific genetic diversity between the two species collected from Punjab and Sindh provinces was 0.117%. The software calculates interspecific and intraspecific genetic diversity by measuring the genetic differences (e.g., nucleotide substitutions) among the sequences and by calculating parameters such as nucleotide diversity (π), which quantifies the average number of nucleotide differences between two sequences.
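The distinction between intraspecific and interspecific diversity can be illustrated with a small calculation. The Python sketch below uses the uncorrected p-distance (the proportion of differing sites) averaged over all pairs within a group and over all pairs between two groups. This is an assumption made for illustration: the substitution model and settings used in MEGA6 are not stated here, and the sequences are toy data rather than sequences from this study.

```python
from itertools import combinations, product
from statistics import mean


def p_distance(a, b):
    """Proportion of sites at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean_within(group):
    """Average p-distance over all pairs inside one group (intraspecific)."""
    return mean(p_distance(a, b) for a, b in combinations(group, 2))


def mean_between(group_1, group_2):
    """Average p-distance over all pairs drawn from two groups (interspecific)."""
    return mean(p_distance(a, b) for a, b in product(group_1, group_2))


# Toy aligned sequences standing in for two species.
species_1 = ["ACGTACGTAC", "ACGTACGTAT", "ACGTACGTAC"]
species_2 = ["ACGAACGTTC", "ACGAACGTTC"]

within_1 = mean_within(species_1)
within_2 = mean_within(species_2)
between = mean_between(species_1, species_2)
print(f"within species 1: {within_1:.3f}")
print(f"within species 2: {within_2:.3f}")
print(f"between species:  {between:.3f}")
```

A between-group distance that is clearly larger than both within-group distances is the pattern expected when two morphologically identified groups correspond to distinct species.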
No insertions, deletions, or stop codons were present in the alignment used for phylogenetic tree construction. When the sequences were compared using nucleotide BLAST, those with accession nos. OP804475-OP804502 showed 96-100% similarity with sequences already stored in the database, while those with accession nos. OP804145-OP804177 showed 98-100% similarity. One closely related COI sequence each of B. dorsalis and B. zonata was downloaded from the databases in FASTA format and used in MEGA 6 for phylogenetic analysis by the maximum likelihood method. The phylogenetic analysis in Fig 1 shows that our sequences clustered in two groups, B. dorsalis and B. zonata, with Liriomyza huidobrensis (AF327292) used as the outgroup.

Matrix analysis of sequences
For the validation of the phylogenetic analysis, matrix analysis of the sequences was also performed using the Sequence Demarcation Tool (SDTv1.2), as shown in Fig 2, which represents two clusters. The pair-wise sequence identity scores of all fruit fly sequences from Punjab province showed 100-89% similarity among each other, while their similarity with the MG881760 B. zonata sequence reported from Iran was 97-98% (Table 5). Table 6 shows fruit fly sequences with pair-wise sequence identity scores of 100-96% among each other and of 100-90% with the MN016995 B. dorsalis sequence reported from India.

Discussion
Morphological identification is the classical and basic method of identifying insect pests. This method has limitations: most economically important pests are difficult to identify via morphometric keys, even for specialists, because a large number of insect pests belong to morphologically cryptic species [27]. The eggs and instars of pest species are also difficult to identify morphologically [28]. By comparison, DNA barcoding provides a quick and authentic means of species identification [29]. However, understanding the population structure and genetic diversity of insects such as fruit flies in any region requires the characterization of a large number of individuals. DNA barcoding is expensive, and its cost is a major challenge in determining the structure of large populations of these insects [29]. Therefore, in this study, we collected infested mango samples across the mango growing areas of Pakistan and first morphologically characterized all first-generation fruit flies. We then randomly selected fruit flies from each location for mt-COI-based DNA barcode analysis and verified our morphometric data. This approach allowed us to provide barcode validation for the 10,653 morphologically characterized fruit flies infesting commercially cultivated mangos in Pakistan. The dominance of B. zonata in Punjab was reported earlier by small-scale studies based on limited locations and samples [19]. There are only three reports on genetic diversity, all very limited in scope [9,16,17]. Similarly, no report has been published before on the diversity and geographical distribution of fruit flies infesting the mango crop in the country. Both species are also reported as major pests of mango from Bangladesh and from Iran and India, the neighboring countries of Pakistan [30][31][32][33].
To ascertain genetic diversity, we used MEGA ver. 6, which calculates interspecific and intraspecific genetic diversity by measuring the genetic differences (e.g., nucleotide substitutions) among the sequences and by calculating parameters such as nucleotide diversity (π), which quantifies the average number of nucleotide differences between two sequences. The methodology thus quantifies genetic variation within and between species. Intraspecific genetic diversity was 0.011% for B. dorsalis and 0.016% for B. zonata. These values point to a relatively limited degree of genetic variability within each species. In contrast, the interspecific genetic diversity between the two species, as sampled from Punjab and Sindh provinces, was notably higher at 0.117%. This observation underscores the more pronounced genetic differences between the two species. Similarly, the sequences of B. zonata and B. dorsalis collected from both provinces were subjected to BLAST analysis and compared with sequences from neighboring countries such as Iran and India. This analysis revealed a high degree of genetic similarity, with sequence identities of 97-98% with MG881760 and 100-90% with MN016995, respectively, while also indicating discernible genetic variation among the fruit fly populations of Pakistan and its adjacent countries.
This study is the first of its kind in Pakistan to survey locations across the two major mango producing provinces of the country with the specific objectives of fruit fly identification and genetic analysis. Its outcomes offer a novel contribution by providing insights into the genetic diversity of this economically significant pest that can support its management.

Conclusion
Morphology-based identification validated by mt-COI gene barcoding shows the presence of two species of Bactrocera in the 2022 mango crop in Pakistan. Validation of the morphological data by mt-COI gene analysis shows that, in the absence of barcode technology, morphology-based identification is an effective approach for screening adult fruit flies of the genus Bactrocera. However, integrated barcoding, a combination of traditional taxonomy and molecular methods, enhances the accuracy and reliability of results. Overall, this study contributes important information on the species diversity and genetic variation of Bactrocera on the mango crop in Pakistan.

Fig 1. Phylogenetic tree showing the relationship among fruit flies based on the mitochondrial cytochrome oxidase I (mt-COI) gene. The sequence of Liriomyza huidobrensis (AF327292) was used as the outgroup. The tree was constructed by the maximum likelihood method in MEGA 6. https://doi.org/10.1371/journal.pone.0304472.g001

Table 2. Fruit fly infested mango variety and detailed location of each sampling area of different districts of Sindh Province.
https://doi.org/10.1371/journal.pone.0304472.t002

All 10,653 first-generation fruit flies that emerged from infested mangos from all locations and districts, including 7804 from Punjab and 2849 from Sindh, were morphologically identified as one of two Bactrocera species, B. dorsalis and B. zonata. The Punjab samples yielded 2690 B. dorsalis and 5114 B. zonata, while the Sindh samples yielded 2693 B. dorsalis and 156 B. zonata, indicating dominance of B. dorsalis in Sindh and of B. zonata in Punjab (Tables 3 and 4).
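The relative abundance of the two species in each province follows directly from the counts reported above. The short Python sketch below performs that arithmetic; the counts are taken from the text, while the variable and function names are chosen only for this example.

```python
# Counts of first-generation flies reported in the text.
counts = {
    "Punjab": {"B. dorsalis": 2690, "B. zonata": 5114},
    "Sindh": {"B. dorsalis": 2693, "B. zonata": 156},
}


def relative_abundance(by_species):
    """Share of each species as a percentage of the province total."""
    total = sum(by_species.values())
    return {species: 100.0 * n / total for species, n in by_species.items()}


shares = {province: relative_abundance(c) for province, c in counts.items()}
grand_total = sum(sum(c.values()) for c in counts.values())

print(f"total flies: {grand_total}")
for province, by_species in shares.items():
    for species, pct in by_species.items():
        print(f"{province:7s} {species:12s} {pct:5.1f}%")
```

On these counts B. zonata makes up roughly two thirds of the Punjab flies and B. dorsalis about 95% of the Sindh flies, which is the geography-based pattern described in the text.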